Search CORE

1,154 research outputs found

Fitting multiplicative models by robust alternating regressions.

Author: Croux Christophe
Filzmoser P
Pison G
Rousseeuw Peter

In this paper a robust approach for fitting multiplicative models is presented. Focus is on the factor analysis model, where we will estimate factor loadings and scores by a robust alternating regression algorithm. The approach is highly robust, and also works well when there are more variables than observations. The technique yields a robust biplot, depicting the interaction structure between individuals and variables. This biplot is not predetermined by outliers, which can be retrieved from the residual plot. Also provided is an accompanying robust $R^2$-plot to determine the appropriate number of factors. The approach is illustrated by real and artificial examples and compared with factor analysis based on robust covariance matrix estimators. The same estimation technique can fit models with both additive and multiplicative effects (FANOVA models) to two-way tables, thereby extending the median polish technique.

Keywords: Alternating regression; Approximation; Biplot; Covariance; Dispersion matrices; Effects; Estimator; Exploratory data analysis; Factor analysis; Factors; FANOVA; Least-squares; Matrix; Median polish; Model; Models; Outliers; Principal components; Robustness; Structure; Two-way table; Variables; Yield

Research Papers in Economics

Robust Estimation with Discrete Explanatory Variables

Author: FR Hampel
M Hubert
M Orhan
P Čížek
PJ Rousseeuw
Publication venue: Humboldt-Universität zu Berlin, Wirtschaftswissenschaftliche Fakultät
Publication date: 01/01/2002

Crossref

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin

Outlier Detection Using Nonconvex Penalized Regression

Author: Art B. Owen
Benjamini Y.
Hadi A. S.
Peña D.
Rousseeuw P.
Yiyuan She
Zhao P.
Publication date: 01/01/2010

This paper studies the outlier detection problem from the point of view of penalized regressions. Our regression model adds one mean shift parameter for each of the $n$ data points. We then apply a regularization favoring a sparse vector of mean shift parameters. The usual $L_1$ penalty yields a convex criterion, but we find that it fails to deliver a robust estimator. The $L_1$ penalty corresponds to soft thresholding. We introduce a thresholding (denoted by $\Theta$) based iterative procedure for outlier detection ($\Theta$-IPOD). A version based on hard thresholding correctly identifies outliers on some hard test problems. We find that $\Theta$-IPOD is much faster than iteratively reweighted least squares for large data because each iteration costs at most $O(np)$ (and sometimes much less), avoiding an $O(np^2)$ least squares estimate. We describe the connection between $\Theta$-IPOD and $M$-estimators. Our proposed method has one tuning parameter with which to both identify outliers and estimate regression coefficients. A data-dependent choice can be made based on BIC. The tuned $\Theta$-IPOD shows outstanding performance in identifying outliers in various situations in comparison to other existing approaches. This methodology extends to high-dimensional modeling with $p\gg n$, if both the coefficient vector and the outlier pattern are sparse.
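The iteration described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the mean shift model $y = X\beta + \gamma + \varepsilon$ and alternates a least squares fit on $y - \gamma$ with hard thresholding of the residuals. The names `hard_threshold` and `ipod_sketch`, the zero starting value for $\gamma$, and the stopping rule are all hypothetical choices; a zero start is the simplest option and may be inadequate on harder problems.

```python
import numpy as np


def hard_threshold(t, lam):
    """Hard thresholding: keep entries with |t| > lam, zero out the rest."""
    return np.where(np.abs(t) > lam, t, 0.0)


def ipod_sketch(X, y, lam, n_iter=200, tol=1e-8):
    """Illustrative thresholding-based outlier detection (assumed form).

    Model assumed here: y = X @ beta + gamma + noise, with gamma sparse.
    Returns the coefficient estimate and the estimated mean shifts;
    nonzero entries of gamma flag suspected outliers.
    """
    y = np.asarray(y, dtype=float)
    # Reduced QR is computed once; afterwards each fit costs O(n p).
    Q, R = np.linalg.qr(X)
    gamma = np.zeros_like(y)  # assumption: start with no outliers flagged
    for _ in range(n_iter):
        fitted = Q @ (Q.T @ (y - gamma))  # projection onto col(X)
        gamma_new = hard_threshold(y - fitted, lam)
        converged = np.max(np.abs(gamma_new - gamma)) < tol
        gamma = gamma_new
        if converged:
            break
    # Recover beta from the outlier-adjusted response.
    beta = np.linalg.solve(R, Q.T @ (y - gamma))
    return beta, gamma
```

Calling `ipod_sketch(X, y, lam)` with a design matrix `X` that includes an intercept column returns `(beta, gamma)`. The threshold `lam` plays the role of the single tuning parameter mentioned in the abstract; how to choose it (for example by BIC) is not covered by this sketch.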

arXiv.org e-Print Archive

CiteSeerX

Crossref

Research Papers in Economics

Predicting deadline transgressions using event logs

Author: A.K. Jallow
J.A. Wickboldt
M. Jans
P. Zhang
P.J. Rousseeuw
W.M.P. Aalst van der
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Effective risk management is crucial for any organisation. One of its key steps is risk identification, but few tools exist to support this process. Here we present a method for the automatic discovery of a particular type of process-related risk, the danger of deadline transgressions or overruns, based on the analysis of event logs. We define a set of time-related process risk indicators, i.e., patterns observable in event logs that highlight the likelihood of an overrun, and then show how instances of these patterns can be identified automatically using statistical principles. To demonstrate its feasibility, the approach has been implemented as a plug-in module to the process mining framework ProM and tested using an event log from a Dutch financial institution

CiteSeerX

Repository TU/e

Crossref

Queensland University of Technology ePrints Archive

Exploring Outliers in Crowdsourced Ranking for QoE

Author: Barnett V.
Foucart S.
Gardlo B.
Johnson T.
Knorr E.
Leroy A.
Rousseeuw P.
Schatz R.
Publication date: 18/07/2017

Outlier detection is a crucial part of robust evaluation for crowdsourceable assessment of Quality of Experience (QoE) and has attracted much attention in recent years. In this paper, we propose some simple and fast algorithms for outlier detection and robust QoE evaluation based on the nonconvex optimization principle. Several iterative procedures are designed with or without knowing the number of outliers in samples. Theoretical analysis is given to show that such procedures can reach statistically good estimates under mild conditions. Finally, experimental results with simulated and real-world crowdsourcing datasets show that the proposed algorithms achieve performance similar to the Huber-LASSO approach in robust ranking, yet with nearly 8 or 90 times speed-up, without or with prior knowledge of the sparsity size of outliers, respectively. The proposed methodology therefore provides a set of helpful tools for robust QoE evaluation with crowdsourcing data.

Comment: accepted by ACM Multimedia 2017 (Oral presentation). arXiv admin note: text overlap with arXiv:1407.763

arXiv.org e-Print Archive

Crossref

A COMPARISON OF METHODS FOR SELECTING PREFERRED SOLUTIONS IN MULTIOBJECTIVE DECISION MAKING

Author: E. Zio
ICRP Publication 60
J. Branke
J.E. Yang
P. Giuggioli Busacca
P. Rousseeuw
P.J. Rousseeuw
S. Chiu
S. Martorell
X. Blasco
Z. Li
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/11/2012

ISBN: 978-94-91216-77-0

In multiobjective optimization problems, the identified Pareto Frontiers and Sets often contain too many solutions, which makes it difficult for the decision maker to select a preferred alternative. To facilitate the selection task, decision making support tools can be used in different instances of the multiobjective optimization search to introduce preferences on the objectives or to give a condensed representation of the solutions on the Pareto Frontier, so as to offer the decision maker a manageable picture of the solution alternatives. This paper presents a comparison of some a priori and a posteriori decision making support methods, aimed at aiding the decision maker in the selection of the preferred solutions. The considered methods are compared with respect to their application to a case study concerning the optimization of the test intervals of the components of a safety system of a nuclear power plant. The engine for the multiobjective optimization search is based on genetic algorithms.

HAL-CentraleSupelec

Crossref

Robust high-dimensional precision matrix estimation

Author: B. Bertsekas
C. Croux
C. Spearman
E. Ollila
F.A. Alqallaf
G. Tarr
G.A.F. Seber
J. Friedman
K. Boudt
M. Yuan
M.A. Finegold
N. Blomqvist
N.J. Higham
O. Banerjee
P. Bühlmann
P.J. Rousseeuw
R. Gnanadesikan
R.A. Maronna
S. Aelst Van
S. Visuri
T. Zhao
T.T. Cai
Publication date: 01/01/2015

The dependency structure of multivariate data can be analyzed using the covariance matrix $\Sigma$. In many fields the precision matrix $\Sigma^{-1}$ is even more informative. As the sample covariance estimator is singular in high dimensions, it cannot be used to obtain a precision matrix estimator. A popular high-dimensional estimator is the graphical lasso, but it lacks robustness. We consider the high-dimensional independent contamination model. Here, even a small percentage of contaminated cells in the data matrix may lead to a high percentage of contaminated rows. Downweighting entire observations, which is done by traditional robust procedures, would then result in a loss of information. In this paper, we formally prove that replacing the sample covariance matrix in the graphical lasso with an elementwise robust covariance matrix leads to an elementwise robust, sparse precision matrix estimator computable in high dimensions. Examples of such elementwise robust covariance estimators are given. The final precision matrix estimator is positive definite, has a high breakdown point under elementwise contamination and can be computed fast.
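For orientation, the plug-in idea in this abstract can be written against the usual graphical lasso objective. The notation below is an assumption made for illustration ($\Omega$ for the precision matrix, $\tilde{S}$ for an elementwise robust covariance estimate, $\lambda \ge 0$ for the penalty level) and is not taken from the paper. Some formulations also penalize the diagonal entries.

```latex
% Graphical lasso with the sample covariance replaced by a robust plug-in
% \tilde{S} (notation assumed for illustration):
\hat{\Omega}
  = \arg\min_{\Omega \succ 0}
    \Bigl\{ \operatorname{tr}(\tilde{S}\,\Omega)
            - \log\det\Omega
            + \lambda \sum_{j \neq k} \lvert \omega_{jk} \rvert \Bigr\}
```

Setting $\tilde{S}$ to the sample covariance matrix gives back the standard, non-robust graphical lasso.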

arXiv.org e-Print Archive

Crossref

Gauge fields, ripples and wrinkles in graphene layers

Author: B. Schölkopf
D. Hawkins
D. Yu
F. Angiulli
J. Tang
L.J. Cao
M.F. Jiang
P. Rousseeuw
R. Nuts
S. Hawkins
S. Papadimitriou
V. Barnett
Z. He
Publication date: 01/01/2005

We analyze elastic deformations of graphene sheets which lead to effective gauge fields acting on the charge carriers. Corrugations in the substrate induce stresses, which, in turn, can give rise to mechanical instabilities and the formation of wrinkles. Similar effects may take place in suspended graphene samples under tension.

Comment: contribution to the special issue of Solid State Communications on graphen

arXiv.org e-Print Archive

Crossref

An integrative clustering approach combining particle swarm optimization and formal concept analysis

Author: A. Alizadeh
A. Brazma
E. Tsiporkova
G. Rustici
J. Besson
J. Handl
J. Kennedy
J.K. Choi
M. Kaytoue-Uberall
P. Rousseeuw
S. Maere
T. Golub
V. Boeva
V. Choi
Zhou
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2012

Crossref

Ghent University Academic Bibliography

Comment on “The IDV index: Its derivation and use in inferring long-term variations of the interplanetary magnetic field strength” by Leif Svalgaard and Edward W. Cliver

Author: A. P. Rouillard
Arge
I. Finch
Lockwood
M. Lockwood
R. Stamper
Rousseeuw
Sokal
Stamper
Svalgaard
van Storch
Vasyliunas
Wang
Whang
Wilkes
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 01/01/2006

Central Archive at the University of Reading

Crossref